Metadata in JPEG 2000 file format

ABSTRACT

A system for including metadata with a JPEG2000 file.

CROSS-REFERENCE TO RELATED APPLICATIONS

[0001] This application claims the benefit of U.S. Provisional Application No. 60/214,878, filed Jun. 28, 2000.

BACKGROUND OF THE INVENTION

[0002] The present invention relates to embedding data in a JPEG2000 file format.

[0003] At the core of the JPEG2000 structure is a wavelet-based compression methodology that provides a number of benefits over the previous Discrete Cosine Transform (DCT) compression methods used in the existing JPEG format. Essentially, wavelets are mathematical expressions that encode the image in a continuous stream, thereby avoiding the tendency toward visible artifacts that can sometimes result from DCT's division of an image into discrete compression blocks.

[0004] JPEG2000 wavelet technology can provide as much as a 20% improvement in compression efficiency over existing JPEG DCT compression methods. JPEG2000 wavelet technology also provides for both lossy and lossless compression, as opposed to the lossy technique used in the original JPEG, which can lead to image degradation at high compression levels. In addition, because the JPEG2000 format includes much richer content than existing JPEG files, the bottom line effect is the ability to deliver a Flashpix-level of information in a compressed image file that is 20% smaller than baseline JPEG and roughly 40% smaller than an equivalent Flashpix file.

[0005] Another inherent benefit of JPEG2000's use of wavelet technology is the ability to progressively access the encoded image in a smooth continuous fashion without having to download, decode, and/or print the entire file. In a way this allows for a virtual file system within the image file that can be flexibly arranged by the image providers to best suit the way that their users will need to access the information. For instance, a “progressive-by-resolution” structure would allow the image information to stream to the user by starting with a low-resolution version and then progressively adding higher resolution as required. On the other hand, a “progressive-by-quality” structure might begin with a full resolution version but with minimal color data per pixel and then progressively add more bits per pixel as required.

[0006] Referring to FIG. 1, a conforming file for the JPEG2000 standard is typically described as a sequence of boxes, some of which contain other boxes. An actual file need not contain all of the boxes shown in FIG. 1, may contain different counts of the boxes, and/or could use the boxes in different positions in the file. A more complete description of the contents of these boxes is given in JPEG2000 Image Coding System: Compound Image File Format, JPEG2000 Part VI committee Draft, 9, March 2001. Schematically, the hierarchical organization of boxes in a JPEG2000 file is shown in FIG. 2. Boxes with dashed borders are optional in conforming JPEG2000 files. However, an optional box may define mandatory boxes within that optional box. In this case, if the optional box exists, the mandatory boxes within the optional box normally exist. FIG. 2 illustrates only the containment relationship between the boxes in the file. A particular order of those boxes in the file is not generally implied. Referring to FIGS. 3A-3D, a list of exemplary boxes that may be used in a JPEG2000 file is illustrated.
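The box sequence and containment described above can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the described system. It assumes the usual box header layout of a 4-byte big-endian length followed by a 4-byte type, with a length of 1 signaling an 8-byte extended length and a length of 0 meaning the box runs to the end of its container. The function names iter_boxes and walk are hypothetical, and treating only "jp2h" as a box that contains other boxes is a simplifying assumption.

```python
import struct
from typing import Iterator, Optional, Tuple


def iter_boxes(data: bytes, start: int = 0,
               end: Optional[int] = None) -> Iterator[Tuple[str, int, int]]:
    """Yield (type, payload_start, payload_end) for each box in data[start:end]."""
    end = len(data) if end is None else end
    pos = start
    while pos + 8 <= end:
        length, box_type = struct.unpack_from(">I4s", data, pos)
        header = 8
        if length == 1:
            # A length of 1 signals an 8-byte extended length after the type.
            if pos + 16 > end:
                raise ValueError(f"truncated extended header at offset {pos}")
            (length,) = struct.unpack_from(">Q", data, pos + 8)
            header = 16
        elif length == 0:
            # A length of 0 means the box runs to the end of its container.
            length = end - pos
        if length < header or pos + length > end:
            raise ValueError(f"malformed box at offset {pos}")
        yield box_type.decode("latin-1"), pos + header, pos + length
        pos += length


def walk(data: bytes, superboxes=("jp2h",), start: int = 0,
         end: Optional[int] = None, depth: int = 0) -> Iterator[Tuple[int, str]]:
    """Yield (depth, type) pairs, descending into boxes listed in superboxes."""
    for box_type, p_start, p_end in iter_boxes(data, start, end):
        yield depth, box_type
        if box_type in superboxes:
            # Assumption: the caller knows which box types contain other boxes.
            yield from walk(data, superboxes, p_start, p_end, depth + 1)
```

Because only the header is inspected, a reader built this way can skip boxes it does not understand, which is what lets additional metadata ride along without disturbing image rendering.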

[0007] A JPEG2000 file may contain metadata boxes with intellectual property right information or vendor specific information. In this manner the JPEG2000 file may be annotated with intellectual property rights information. In particular, the metadata will normally provide the ability to include copyright information, such as the proper copyright ownership of image files. This helps alleviate long-held concerns regarding the unauthorized appropriation of image files without the copyright owner's consent. In this manner, at least the copyright information will be provided together with the JPEG2000 file and the image described therein.

[0008] A JPEG2000 file may also include a UUID (universal unique identifier) box that contains vendor specific information. There may be multiple UUID boxes within the file. The UUID box is intended to provide additional vendor specific information for particularized applications, which would normally reflect information regarding the rendering or usage of the image contained within the file. However, the content to be provided within the UUID box is undefined.

BRIEF DESCRIPTION OF THE DRAWINGS

[0009] FIG. 1 illustrates JPEG2000 file elements and structure.

[0010] FIG. 2 illustrates conceptual structure of a JPEG2000 file.

[0011] FIGS. 3A-3D describe boxes used in a JPEG2000 file.

[0012] FIG. 4 illustrates a metadata box of a JPEG2000 file.

[0013] FIG. 5 illustrates a UUID box of a JPEG2000 file.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT

[0014] What may be observed from the file format used for JPEG2000 is that nearly all the boxes contain data relevant to the rendering of the image itself, which is what would be expected from an image file format for a particular type of image, such as JPEG2000. Further, the image file format has been extended to include copyright information which is likewise of particular interest for the creator of the document. After consideration of the JPEG2000 file format and the constructs provided by the JPEG2000 file format, the present inventors came to the startling realization that the previously recognized uses of the JPEG2000 metadata box may be extended, while not extending or otherwise modifying the file format in a non-compliant manner, to include data that is representative of a description of the content depicted by the JPEG2000 file or to otherwise provide interactivity with the rendered image. The JPEG2000 file format was intended to be a self-contained image description format for the rendering of an image and was not intended to support a description of the content of the image nor provide interactivity with the rendered image. Normally, if additional interactivity is desired for an image, the file format is extended in a proprietary manner or otherwise an additional program is provided which provides such a description of the content, such as a database, and interactivity with the rendered image, such as animation and game software. Preferably, the content of the metadata box does not change the visual appearance of the image.

[0015] Referring to FIG. 4, for example, the metadata box may contain information regarding links to additional information, voice annotations, textual information describing the content of the image, hot spots, and object boundary information regarding objects within the image itself. Further, the textual information may relate to, for example, the title, the category, keywords, date of creation, time of creation, etc. In this manner, the textual information describes the content of the image to be rendered by a suitable JPEG2000 viewer but is typically free from changing the rendered image. In addition, this information is provided within the constructs of the JPEG2000 file format in a compliant manner so that all compliant JPEG2000 viewers will be able to render the image in a proper manner and in addition process the additional information, if desired. It is to be understood that the metadata box is preferably in XML format; however, any format may be used, if desired.

[0016] Referring to FIG. 5, after realizing the potential extension to the JPEG2000 file format the present inventors likewise determined that the UUID box may contain information regarding links to additional information, voice annotations, textual information describing the content of the image (e.g., actor, theme, genre, location, etc.), hot spots, and object boundary information regarding objects within the image itself. Further, the textual information may relate to, for example, the title, the category, keywords, date of creation, time of creation, etc. In this manner, the textual information describes the content of the image to be rendered by a suitable JPEG2000 viewer but is typically free from changing the rendered image. In addition, this information is provided within the constructs of the JPEG2000 file format in a compliant manner so that all compliant JPEG2000 viewers will be able to render the image in a proper manner and in addition process the additional information, if desired. It is to be understood that the UUID box is preferably in XML format; however, any format may be used, as desired.

[0017] MPEG-7 is a description scheme that, at least in part, provides a description of the content of video, such as actor, genre, etc. While MPEG-7 was specifically designed to relate to video content, the present inventors came to the realization that this video based scheme may be used for describing the content of an image file, namely JPEG2000 files, preferably in a compliant manner. Further, the JPEG2000 specification does not define the syntax and semantics for the metadata that can be placed in the metadata and/or UUID boxes in the file format. Therefore, a need exists for the specification of the syntax and semantics for the contents of these boxes, preferably in a standardized syntax and semantics specification that will permit the exchangeability of the metadata contents contained in these boxes. Referring to FIGS. 4 and 5, the present inventors came to the further realization that at least a portion of the MPEG-7 description schemes describing video content may be suitable for use within the metadata boxes and/or UUID boxes of the JPEG2000 file format. This unlikely combination of file formats, namely JPEG2000 for image files and MPEG-7 describing video content, provides advantageous multi-standard interoperability. MPEG-7 is described in MPEG-7 Multimedia Description Schemes, Experimentation Model (XM) V 3.0, N3410, Geneva, May 2000; MPEG-7 Multimedia Description Schemes, Working Draft (WD) V. 3.0, N3411, Geneva, May 2000; MPEG-7 Description Definition Language (DDL) WD 3.0, N3391, Geneva, May 2000; MPEG-7 Visual Part of XM 6.0, N3398, Geneva, May 2000; MPEG-7 Visual Part, Working Draft (WD) V. 3.0, N3399, Geneva, May 2000; all of which are incorporated by reference herein.

[0018] While the combination of MPEG-7 and JPEG2000 is a desirable goal, the resulting file is preferably self-contained, in that all of the data necessary to render the image is contained within the file format. In the same manner, preferably the metadata or UUID information includes the binary data necessary to execute or otherwise cause the desired activity to be carried out. In contrast to the execution of binary code, MPEG-7 was designed to provide a description of the content of the video media and accordingly lacked suitable constructs for embedding binary data with the information. After the determination of the need for embedding binary data within an MPEG-7 description scheme, especially suitable for providing metadata or UUID data within a JPEG2000 file format, the present inventors modified the previously existing MPEG-7 standard to include a suitable technique for including binary data, which was not previously considered to have any value.

[0019] A new description scheme has been developed, namely, “InlineMedia”, that permits the identification of the format of the media stream, such as, for example, indicated by a MediaFormat Description Scheme or a FileFormat (MIME-type) identifier. The audio and/or visual material contained in an InlineMedia description may be either essence data or audio and/or visual data representing other essence data, depending on its context. The InlineMedia enables the description of audio and/or visual data located within the description itself, without having to refer to a location external to the description.

[0020] The InlineMedia syntax may be as follows:

<!-- ######################################## -->
<!-- Definition of InlineMedia Datatype -->
<!-- ######################################## -->
<complexType name="InlineMediaType">
  <choice>
    <element name="MediaData16">
      <simpleType>
        <restriction base="binary">
          <encoding value="hex"/>
        </restriction>
      </simpleType>
    </element>
    <element name="MediaData64">
      <simpleType>
        <restriction base="binary">
          <encoding value="base64"/>
        </restriction>
      </simpleType>
    </element>
  </choice>
  <attribute name="type" type="mpeg7:mimeType" use="required"/>
</complexType>

[0021] It is noted that <!-- is the start of a comment while --> is the end of a comment. Likewise, choice provides a set of options, with the first option being binary data encoded in base 16 and the second option being binary data encoded in base 64. Other bases may likewise be used, as desired. The attribute name indicates the data type, such as MPEG data and their format, and whether this attribute is included in the description.

Summary of InlineMediaType

[0022] InlineMediaType A descriptor for specifying media data embedded in the description.

[0023] MediaData16 Specifies binary media data encoded as a textual string in base-16 format.

[0024] MediaData64 Specifies binary media data encoded as a textual string in base-64 format.

[0025] type Specifies the MIME type of media data.

InlineMedia Example

<myInlineMedia type="image/jpeg">
  <MediaData16>98A34F10C5094538AB93873262522DA3</MediaData16>
</myInlineMedia>
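As an illustration of how binary media might be carried in an InlineMedia description, the following Python sketch encodes a byte string as either MediaData16 (base 16) or MediaData64 (base 64) text and decodes it again. The helper names build_inline_media and extract_inline_media are hypothetical, XML namespaces are omitted for brevity, and the element tag used for the wrapper is an assumption.

```python
import base64
import binascii
import xml.etree.ElementTree as ET
from typing import Tuple


def build_inline_media(data: bytes, mime_type: str,
                       encoding: str = "base64",
                       tag: str = "InlineMedia") -> ET.Element:
    """Wrap raw bytes in an element carrying a MIME 'type' attribute."""
    root = ET.Element(tag, {"type": mime_type})
    if encoding == "hex":
        child = ET.SubElement(root, "MediaData16")
        child.text = binascii.hexlify(data).decode("ascii").upper()
    elif encoding == "base64":
        child = ET.SubElement(root, "MediaData64")
        child.text = base64.b64encode(data).decode("ascii")
    else:
        raise ValueError(f"unsupported encoding: {encoding}")
    return root


def extract_inline_media(element: ET.Element) -> Tuple[str, bytes]:
    """Return (mime_type, raw_bytes) from an InlineMedia-style element."""
    mime_type = element.get("type")
    hex_node = element.find("MediaData16")
    if hex_node is not None:
        return mime_type, binascii.unhexlify(hex_node.text.strip())
    b64_node = element.find("MediaData64")
    if b64_node is not None:
        return mime_type, base64.b64decode(b64_node.text.strip())
    raise ValueError("no MediaData16 or MediaData64 child found")
```

Base 16 doubles the size of the embedded data while base 64 adds roughly one third, so base 64 is the more compact of the two text encodings for larger media such as audio clips.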

[0026] The binary code embedded within the InlineMedia may be, for example, executable code, audio segments, video segments, and still images.

[0027] The InlineMedia descriptor is preferably included within the MediaLocation specification in MPEG-7, by modification of the MediaLocator specification. The MediaFormat syntax may be as follows:

<!-- ######################################## -->
<!-- Definition of the Media Format DS -->
<!-- ######################################## -->
<complexType name="MediaFormat">
  <element name="FileFormat" type="mds:ControlledTerm"/>
  <element name="System" type="mds:ControlledTerm" minOccurs="0"/>
  <element name="Medium" type="mds:ControlledTerm" minOccurs="0"/>
  <element name="Color" type="mds:ControlledTerm" minOccurs="0"/>
  <element name="Sound" type="mds:ControlledTerm" minOccurs="0"/>
  <element name="FileSize" type="nonNegativeInteger" minOccurs="0"/>
  <element name="Length" type="mds:TimePoint" minOccurs="0"/>
  <element name="AudioChannels" type="nonNegativeInteger" minOccurs="0"/>
  <element name="AudioLanguage" type="language" minOccurs="0"/>
  <element name="id" type="ID"/>
</complexType>

Summary of MediaFormat

[0028] MediaFormat Description of the storage format of the media.

id Identification of the instance of the media format description.

FileFormat The file format or MIME type of the audio and/or video content instance.

System The video system of the audio and/or video content (e.g., PAL, NTSC).

Medium The storage medium on which the audio and/or video content is stored (e.g., tape, CD, DVD).

Color The color domain of the audio and/or video content (e.g., color, black/white, colored).

Sound The sound domain of the audio and/or video content (e.g., no sound, stereo, mono, dual, surround, 5.1, Dolby Digital).

FileSize The size, in bytes for example, of the file where the audio and/or video content is stored.

Length The duration of the audio and/or video content.

AudioChannels The number of audio channels in the audio and/or video content.

AudioLanguage The language used in the audio of the audio and/or video content.

[0029] Also, the previously existing MediaLocator of MPEG-7 is extended by adding the InlineMedia as follows:

<complexType name="MediaLocator">
  <choice>
    <sequence>
      <element name="MediaURL" type="mds:MediaURL"/>
      <element name="MediaTime" type="mds:MediaTime" minOccurs="0"/>
    </sequence>
    <element name="MediaTime" type="mds:MediaTime"/>
    <element name="InlineMedia" type="mds:InlineMedia"/>
  </choice>
</complexType>

MediaLocator Example

<MediaLocator>
  <InlineMedia>
    <FileFormat>mp3</FileFormat>
    <MediaData>98A34F12348942323423AB2342</MediaData>
  </InlineMedia>
</MediaLocator>

[0030] An alternative implementation assumes that the media data can be placed at an arbitrary location in the JPEG2000 files. In this case a byte offset may be used to locate the binary data, and the MediaLocator is alternatively modified as follows:

<complexType name="MediaLocator">
  <choice>
    <sequence>
      <element name="MediaURL" type="mds:MediaURL"/>
      <element name="MediaTime" type="mds:MediaTime" minOccurs="0"/>
    </sequence>
    <sequence>
      <element name="MediaURL" type="mds:MediaURL"/>
      <element name="ByteOffset" type="nonNegativeInteger" minOccurs="0"/>
    </sequence>
    <element name="MediaTime" type="mds:MediaTime"/>
  </choice>
</complexType>

[0031] The byte offset may be an absolute offset within the file, a relative offset, or otherwise indicate a location within the file.

[0032] Another embodiment of the present invention includes another class of applications, namely, defining a bounding region for a portion of the image and associating metadata and information with the bounding region(s). The information is typically related to the objects (or image regions) that are defined by the bounding region. The metadata box and/or the UUID box in the JPEG2000 file format may be utilized to store descriptors and data that define and identify the bounding regions as well as data associated with the regions, such as object specific URL links, voice annotation, and textual annotation. One of many applications of such data is user interaction with images where the users interactively discover and consume information that relates to the content of the image.

[0033] While any suitable syntax may be used to define the bounding region, the bounding region is preferably expressed in XML. Further, the XML is preferably expressed in the form defined by MPEG-7 so that the JPEG2000 file and the MPEG-7 portion are compliant with the respective standards.

[0034] Within the MPEG-7 standard the bounding region may be achieved by using the Still Region Description Scheme. The Still Region Description Scheme is derived from the Segment Description Scheme. The Segment Description Scheme is used to specify the structure of spatial and temporal segments of visual data such as images and video in general. Segments can be decomposed into other segments. The Still Region Description Scheme is used to specify a spatial type of segment in still images or a single video frame.

[0035] The Segment Description Scheme and the Still Region Description Scheme may be as follows:

<!-- ######################################## -->
<!-- Definition of "Segment DS" -->
<!-- ######################################## -->
<!-- Definition of datatype of the decomposition -->
<simpleType name="DecompositionDataType" base="string">
  <enumeration value="spatial"/>
  <enumeration value="temporal"/>
  <enumeration value="spatio-temporal"/>
  <enumeration value="MediaSource"/>
</simpleType>
<!-- Definition of the decomposition -->
<complexType name="SegmentDecomposition">
  <element ref="Segment" minOccurs="1" maxOccurs="unbounded"/>
  <attribute name="DecompositionType" type="mds:DecompositionDataType" use="required"/>
  <attribute name="Overlap" type="boolean" use="default" value="false"/>
  <attribute name="Gap" type="boolean" use="default" value="false"/>
</complexType>
<element name="Segment" type="mds:Segment"/>
<!-- Definition of the Segment itself -->
<complexType name="Segment" abstract="true">
  <element name="MediaInformation" type="mds:MediaInformation" minOccurs="0" maxOccurs="1"/>
  <element name="CreationMetaInformation" type="mds:CreationMetaInformation" minOccurs="0" maxOccurs="1"/>
  <element name="UsageMetaInformation" type="mds:UsageMetaInformation" minOccurs="0" maxOccurs="1"/>
  <element name="StructuredAnnotation" type="mds:StructuredAnnotation" minOccurs="0" maxOccurs="unbounded"/>
  <element name="MatchingHint" type="mds:MatchingHint" minOccurs="0" maxOccurs="unbounded"/>
  <element name="PointOfView" type="mds:PointOfView" minOccurs="0" maxOccurs="unbounded"/>
  <element name="SegmentDecomposition" type="mds:SegmentDecomposition" minOccurs="0" maxOccurs="unbounded"/>
  <attribute name="id" type="ID" use="required"/>
  <attribute name="href" type="uriReference" use="optional"/>
  <attribute name="idref" type="IDREF" refType="Segment" use="optional"/>
</complexType>

Summary of SegmentDecomposition

[0036] SegmentDecomposition Decomposition of a segment into one or more segments.

[0037] DecompositionDataType Datatype defining the kind of segment decomposition. The possible kinds of segment decomposition are spatial, temporal, spatio-temporal, and media source. The bounding regions may be, for example, spatial segments.

[0038] DecompositionType Attribute, which specifies the decomposition type of a segment.

[0039] Overlap Boolean, which specifies if the segments resulting from a segment decomposition overlap in time or space. The bounding regions in the image may overlap.

[0040] Gap Boolean, which specifies if the segments resulting from a segment decomposition leave gaps in time or space.

[0041] Segment Set of segments that form the composition.

Summary of Segment

[0042] Segment Abstract structure which represents a fragment or section of the audio and/or video content. For example, a segment may be a region in an image or a moving region in a video sequence. A segment can be decomposed into other segments through the SegmentDecomposition. This may be used to specify the object's shape, if needed, within a bounding region, where the outline of the object is specified in terms of a decomposition of the bounding region.

[0043] id Identifier of a video segment. This may be used to uniquely identify multiple bounding regions, spatial segments, in an image.

[0044] DecompositionDataType Datatype defining the kind of segment decomposition. The possible kinds of segment decomposition are spatial, temporal, spatio-temporal, and media source.

[0045] MediaInformation Media information relates to the segment and its descendants.

[0046] CreationMetaInformation Creation Meta Information relates to the segment and its descendants. This may be used to associate data with segments, such as URL, audio files, etc.

[0047] UsageMetaInformation Usage Meta Information relates to the segment and its descendants.

[0048] SegmentDecomposition Decomposition of the segment into sub-segments.

[0049] Annotation Textual annotation and description of people, animals, objects, actions, places, time, and/or purpose which are instantiated in the segment. This may be used to associate textual annotations with the bounding regions.

<!-- ######################################## -->
<!-- Definition of "StillRegion DS" -->
<!-- ######################################## -->
<element name="StillRegion" type="mds:StillRegion" equivClass="Segment"/>
<complexType name="StillRegion" base="mds:Segment" derivedBy="extension">
  <element ref="ColorSpace" minOccurs="0" maxOccurs="1"/>
  <element ref="ColorQuantization" minOccurs="0" maxOccurs="1"/>
  <element ref="DominantColor" minOccurs="0" maxOccurs="1"/>
  <element ref="ColorHistogram" minOccurs="0" maxOccurs="1"/>
  <element ref="BoundingBox" minOccurs="0" maxOccurs="1"/>
  <element ref="RegionShape" minOccurs="0" maxOccurs="1"/>
  <element ref="ContourShape" minOccurs="0" maxOccurs="1"/>
  <element ref="ColorStructureHistogram" minOccurs="0" maxOccurs="1"/>
  <element ref="ColorLayout" minOccurs="0" maxOccurs="1"/>
  <element ref="CompactColor" minOccurs="0" maxOccurs="1"/>
  <element ref="HomogeneousTexture" minOccurs="0" maxOccurs="1"/>
  <element ref="TextureBrowsing" minOccurs="0" maxOccurs="1"/>
  <element ref="EdgeHistogram" minOccurs="0" maxOccurs="1"/>
  <attribute name="SpatialConnectivity" type="boolean" use="required"/>
  <!-- Restriction of refType to StillRegion DS -->
  <attribute name="idref" type="IDREF" refType="StillRegion" use="optional"/>
</complexType>

StillRegion Summary

[0050] StillRegion Set of pixels from an image or a frame in a video sequence. It is noted that no motion information should be used to describe a still region. A still image can be a natural image or a synthetic image. A still image is a particular case of a still region. The pixels do not need to be connected (see the SpatialConnectivity attribute).

[0051] SpatialConnectivity Boolean which specifies if a still region is connected in space, i.e. connected pixels.

[0052] ColorSpace Description of the color space used for the color of the still region.

[0053] ColorQuantization Description of the color quantization used for the color of the still region.

[0054] DominantColor Description of the dominant color of the still region.

[0055] ColorHistogram Description of the color histogram of the region. This may be used to embed a low-level color description to bounding regions, when desired.

[0056] BoundingBox Description of a bounding region containing the region. This is used to describe the bounding region as a region, such as a rectangular region.

[0057] Using the aforementioned specification the bounding region in a JPEG2000 image may be described as spatial segments and the descriptor BoundingBox may be used to define the locations and dimensions of bounding region(s), and each region is identified by an id, which is preferably unique.
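One way a viewer might use such bounding regions as hot spots is sketched below in Python. The sketch assumes each region has already been read from the description as an id plus a rectangle given by a top-left corner, a width, and a height in pixels. That coordinate representation, the class BoundingRegion, and the function regions_at are hypothetical and are not defined by the description schemes above.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class BoundingRegion:
    """A rectangular bounding region identified by a (preferably unique) id."""
    region_id: str
    x: int        # left edge, in pixels (assumed convention)
    y: int        # top edge, in pixels (assumed convention)
    width: int
    height: int

    def contains(self, px: int, py: int) -> bool:
        # Half-open intervals so adjacent regions do not both claim an edge pixel.
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)


def regions_at(regions: Iterable[BoundingRegion], px: int, py: int) -> List[str]:
    """Return the ids of all regions containing the point (px, py)."""
    # Regions may overlap, so every match is returned, in description order.
    return [r.region_id for r in regions if r.contains(px, py)]
```

The returned ids can then be used to look up whatever annotation, link, or audio data is associated with each region.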

[0058] Embedding of textual information, such as annotations, may be implemented by the structured annotation description scheme. Each segment can reference the structured annotation description scheme individually and at multiplicities identified by their corresponding identifiers. The StructuredAnnotation Description Scheme may be as follows:

<!-- ######################################## -->
<!-- Definition of StructuredAnnotation DS -->
<!-- ######################################## -->
<element name="TextAnnotation" type="mds:TextualDescription"/>
<element name="structuredAnnotation" type="mds:StructuredAnnotation"/>
<complexType name="StructuredAnnotation" type="mds:StructuredAnnotation"/>
<complexType name="StructuredAnnotation">
  <element name="Who" type="mds:ControlledTerm" minOccurs="0"/>
  <element name="WhatObject" type="mds:ControlledTerm" minOccurs="0"/>
  <element name="WhatAction" type="mds:ControlledTerm" minOccurs="0"/>
  <element name="Where" type="mds:ControlledTerm" minOccurs="0"/>
  <element name="When" type="mds:ControlledTerm" minOccurs="0"/>
  <element name="Why" type="mds:ControlledTerm" minOccurs="0"/>
  <element name="TextAnnotation" type="mds:TextualDescription" minOccurs="0"/>
  <attribute name="id" type="ID"/>
  <attribute ref="xml:lang"/>
</complexType>

StructuredAnnotation Summary

[0059] TextAnnotation Free textual annotation.

[0060] StructuredAnnotation Textual free annotation and description of people, animals, objects, actions, places, time, and/or purpose.

[0061] Who Textual description of people and animals. May be from a thesaurus or a controlled vocabulary.

[0062] WhatObject Textual description of objects. May be from a thesaurus or a controlled vocabulary.

[0063] WhatAction Textual description of actions. May be from a thesaurus or a controlled vocabulary.

[0064] Where Textual description of places. May be from a thesaurus or a controlled vocabulary.

[0065] When Textual description of time. May be from a thesaurus or a controlled vocabulary.

[0066] Why Textual description of purpose. May be from a thesaurus or a controlled vocabulary.

[0067] Annotation Textual free annotation and description of people, animals, objects, actions, places, time, and/or purpose.

[0068] id Identifier for an instantiation of the StructuredAnnotation Description Scheme.

[0069] Embedding of universal resource locators (URLs) (identifiers for information outside of the JPEG2000 file) for each bounding region may be realized using the RelatedMaterial description. The RelatedMaterial description scheme is referenced by the CreationMetaInformation DS. Each segment (e.g., each bounding region) references the CreationMetaInformation DS, multiple times, if desired. The RelatedMaterial DS may be specified as follows:

<!-- ######################################## -->
<!-- Definition of the RelatedMaterial DS -->
<!-- ######################################## -->
<DSType name="RelatedMaterial">
  <attribute name="id" datatype="ID"/>
  <attribute name="Master" datatype="boolean" default="true" required="false"/>
  <DTypeRef name="MediaType" type="controlledTerm"/>
  <DSTypeRef type="MediaLocator" minOccurs="0"/>
  <DSTypeRef type="MediaInformation" minOccurs="0"/>
  <DSTypeRef type="CreationMetaInformation" minOccurs="0"/>
  <DSTypeRef type="UsageMetaInformation" minOccurs="0"/>
</DSType>

RelatedMaterial Summary

[0070] RelatedMaterial Description of the materials containing additional information about the audio and/or video content.

[0071] Master Boolean attribute that identifies whether the referenced related material is the master.

[0072] MediaType The media type of the referenced related material (e.g., web page, audiovisual media, a printed book).

[0073] MediaLocator The locator of the referenced related material.

[0074] MediaInformation The media information description of the referenced related material.

[0075] CreationMetaInformation The creation meta information description of the referenced related material.

[0076] UsageMetaInformation The usage meta information description of the referenced related material.

[0077] In another embodiment the media data may be included in the UUID box in the JPEG2000 file. In this embodiment the MPEG-7 description schemes are suitable for use in their previously existing format. Typically the UUID box is implicitly referenced from the metadata box via the MediaFormat Description Scheme. The MediaProfile DS and the MediaInformation DS may be as follows:

<!-- ######################################## -->
<!-- Definition of the MediaProfile DS -->
<!-- ######################################## -->
<DSType name="MediaProfile">
  <attribute name="id" datatype="ID"/>
  <DSTypeRef type="MediaInformation"/>
  <DSTypeRef type="MediaFormat"/>
  <DSTypeRef type="MediaCoding" minOccurs="0" maxOccurs="*"/>
  <DSTypeRef type="MediaInstance" minOccurs="0" maxOccurs="*"/>
</DSType>

Summary of MediaProfile

[0078] MediaProfile DS describing one profile of the media being described.

[0079] id Identification of the instance of the MediaProfile description.

[0080] MediaIdentification Identification of the master media profile.

[0081] MediaFormat Description of the storage format of the master media profile.

[0082] MediaCoding Description of the coding parameters of the master media profile.

[0083] MediaInstance Description and the localization of the master media profile.

<!-- ######################################## -->
<!-- Definition of the MediaInformation DS -->
<!-- ######################################## -->
<DSType name="MediaInformation">
  <attribute name="id" datatype="ID"/>
  <DSTypeRef type="MediaProfile" maxOccurs="*"/>
</DSType>

Summary of MediaInformation

[0084] MediaInformation The MediaInformation DS contains one or more MediaProfile DSs. Each MediaInformation DS is related to one reality. For example, a concert may have been recorded in audio and in audio-visual media. Afterwards each media may be available in different formats, e.g., the audio media in CD, and the audio-visual media in MPEG-1, MPEG-2, and MPEG-4. This will imply four MediaProfiles for the same reality.

[0085] id Identification of the instance of the MediaProfile description.

[0086] MediaProfile DS describing one profile of the essence being described.

[0087] In this embodiment, when the MediaLocator within the Related Media description points at the JPEG2000 file itself via MediaURL, the client application implicitly knows that the related media is contained in a UUID box within the same file that contains the XML box. The UUID is referenced through the MediaFormat description. The application will then locate the UUID box with the matching ID in the file and read its contents. The format of the audio media (e.g., mp3) that is contained in the UUID box may be specified a priori by the owner of the UUID format. The mechanism for referring to the JPEG2000 file itself and the UUID from the XML box is summarized below, using the existing MPEG-7 description schemes and their hierarchical structure:

[0088] -----

[0089] RelatedMaterial

[0090]     MediaType

[0091]         Audio

[0092]     MediaLocator

[0093]         URL: JPEG2000 file

[0094]     MediaInformation

[0095]         MediaProfile

[0096]             MediaFormat

[0097]                 UUID

[0098] -----
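The lookup step described above (locating the UUID box whose ID matches the one referenced from the XML box) may be sketched as follows. This is only an illustrative sketch, not the implementation of any particular reader. It assumes the standard JPEG2000 box layout (a 4-byte big-endian length, a 4-byte type, and an optional 8-byte extended length) and that a UUID box payload begins with its 16-byte ID. The function names iter_boxes, find_uuid_box, and make_box are hypothetical.

```python
import struct
from typing import Iterator, Optional, Tuple


def iter_boxes(data: bytes) -> Iterator[Tuple[bytes, bytes]]:
    """Yield (box_type, payload) for each top-level box in the byte string."""
    pos = 0
    while pos + 8 <= len(data):
        length, box_type = struct.unpack_from(">I4s", data, pos)
        header = 8
        if length == 1:
            # An extended 64-bit length follows the type field.
            if pos + 16 > len(data):
                break
            (length,) = struct.unpack_from(">Q", data, pos + 8)
            header = 16
        elif length == 0:
            # A zero length means the box runs to the end of the file.
            length = len(data) - pos
        if length < header or pos + length > len(data):
            break  # malformed box; stop scanning
        yield box_type, data[pos + header:pos + length]
        pos += length


def find_uuid_box(data: bytes, wanted_id: bytes) -> Optional[bytes]:
    """Return the data of the UUID box whose 16-byte ID matches, else None."""
    for box_type, payload in iter_boxes(data):
        if box_type == b"uuid" and payload[:16] == wanted_id:
            return payload[16:]
    return None


def make_box(box_type: bytes, payload: bytes) -> bytes:
    """Helper for the example only: build a box with a plain 32-bit length."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload


if __name__ == "__main__":
    audio_id = bytes(range(16))  # hypothetical ID taken from MediaFormat
    jp2 = make_box(b"xml ", b"<RelatedMaterial/>") + make_box(
        b"uuid", audio_id + b"mp3-bytes"
    )
    print(find_uuid_box(jp2, audio_id))
```

In this sketch the client would take the ID from the MediaFormat description in the XML box and pass it as wanted_id; how the ID is serialized in the XML is not specified here and would be up to the owner of the UUID format.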

[0099] The XML box is equipped with a mechanism to refer to the UUID box that contains the data, as described above. A format needs to be specified for the UUID box in order to organize the data within and associate the data with different regions and different media types. This format is typically vendor specific and identified by the UUID.

[0100] The following format for the UUID box is one potential example. It assumes that all the embedded data is stored in one single UUID box, provided that the data are within the same file. Data associated with different regions are identified according to their corresponding region ID. Types of data are also specified. The Region Data Length is included to minimize parsing during navigation amongst different regions as the user interacts with the image. The Media Data Length is included to enable rapid navigation of data embedded within the same region.

[0101] UUID Box Format | Comment

[0102] ID | The ID of the particular UUID box is specified by the MediaInformation/MediaFormat description referenced in the RelatedMaterial description in the XML box.

[0103] Region ID | Matches the ID of the Still Region described by the StillRegion description in the XML box.

[0104] Region Data Length | Total length of data associated with this region.

[0105] Media Type | Media Type corresponds to the value of the MediaType descriptor in the RelatedMaterial description in the XML box (it may be mapped to a binary code in the UUID box).

[0106] Media Data Length

[0107] Media Data

[0108] . . .

[0109] Media Type

[0110] Media Data Length

[0111] Media Data

[0112] . . .

[0113] Region ID

[0114] Region Data Length

[0115] Media Type

[0116] Media Data Length

[0117] Media Data

[0118] It may be noted that the Region ID in the above table may be generalized to an “Object ID”. The Object ID may then refer to any XML object, i.e., any description that is identified by an ID. In that case, a Person Description may have an audio annotation associated with it, or a Summary Description may have executable software associated with it. MPEG-7 does support identification of XML descriptions using unique identifiers.
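By way of illustration only, the example UUID box format above may be read as sketched below. The specification does not fix field widths or binary codes, so this sketch assumes hypothetical ones: a 4-byte Region ID, a 4-byte Region Data Length covering all media entries of that region, a 1-byte Media Type code, and a 4-byte Media Data Length, all big-endian. The MEDIA_TYPES mapping and the function names are likewise assumptions.

```python
import struct
from typing import Iterator, List, Tuple

# Hypothetical binary codes for the MediaType values used in the XML box.
MEDIA_TYPES = {1: "Audio", 2: "Executable"}


def iter_regions(payload: bytes) -> Iterator[Tuple[int, bytes]]:
    """Yield (region_id, region_data), using Region Data Length to jump
    from one region to the next without parsing the media inside."""
    pos = 0
    while pos < len(payload):
        region_id, region_len = struct.unpack_from(">II", payload, pos)
        start = pos + 8
        end = start + region_len
        if end > len(payload):
            raise ValueError("region data runs past the end of the payload")
        yield region_id, payload[start:end]
        pos = end


def parse_media(region_data: bytes) -> List[Tuple[str, bytes]]:
    """Split one region's data into (media_type, media_data) entries."""
    entries = []
    pos = 0
    while pos < len(region_data):
        code, length = struct.unpack_from(">BI", region_data, pos)
        start = pos + 5
        if start + length > len(region_data):
            raise ValueError("media data runs past the end of the region")
        name = MEDIA_TYPES.get(code, "Unknown")
        entries.append((name, region_data[start:start + length]))
        pos = start + length
    return entries


def media_for_region(payload: bytes, wanted: int) -> List[Tuple[str, bytes]]:
    """Return the media entries of one region, skipping all other regions."""
    for region_id, region_data in iter_regions(payload):
        if region_id == wanted:
            return parse_media(region_data)
    return []
```

Under these assumptions, a client reacting to a click inside a bounding region would call media_for_region with that region's ID; regions that are not of interest are skipped by length alone, which is the purpose of the Region Data Length field described above.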

Summary of MPEG-7 tools used in the UUID box of JPEG2000

[0119] Embedded Information | MPEG-7 Tool | JPEG2000 File Format Structure

[0120] Bounding Region(s) | Still Region DS | XML Box

[0121] Textual Annotation | Annotation DS | XML Box

[0122] URL Link | Related Material DS | XML Box

[0123] Audio/Voice Annotation Data | Related Material DS | XML Box: indicates Media Type as “Audio” and contains reference to the UUID Box; UUID Box: contains the audio data.

[0124] Executable Code | Related Material DS | XML Box: indicates Media Type as “executable” and contains reference to the ID of the UUID box containing the executable; UUID Box: contains the executable code.

[0125] In a multi-level implementation of the system, a server may first provide the client the image data, the bounding regions, and the type and format of the data associated with the bounding regions. The data that is of further interest to the user may then be delivered upon the user's request.
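The two-stage delivery described above may be sketched as follows. This is a minimal in-memory illustration under stated assumptions, not a network protocol: the Region and ImageServer classes and the initial_response and fetch_media methods are hypothetical names, and a bounding region is assumed to be a rectangle given as (x, y, width, height).

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Region:
    region_id: int
    box: Tuple[int, int, int, int]  # x, y, width, height (assumed layout)
    media: Dict[str, bytes] = field(default_factory=dict)  # media type -> data


@dataclass
class ImageServer:
    image_data: bytes
    regions: List[Region]

    def initial_response(self) -> dict:
        """First level: image data, bounding regions, and the types of the
        associated data, but none of the associated data itself."""
        return {
            "image": self.image_data,
            "regions": [
                {"id": r.region_id, "box": r.box, "media_types": sorted(r.media)}
                for r in self.regions
            ],
        }

    def fetch_media(self, region_id: int, media_type: str) -> bytes:
        """Second level: deliver one item of associated data on request."""
        for r in self.regions:
            if r.region_id == region_id and media_type in r.media:
                return r.media[media_type]
        raise KeyError((region_id, media_type))
```

A client would render the image and regions from initial_response and call fetch_media only when the user selects a region, so that audio or executable data the user never asks for is not transferred.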

[0126] If desired, MPEG-7 compliant data/information may be considered to be data/information compliant with the MPEG-7 specification as it exists (or substantially similar) as of the date of filing of this application.

[0127] All the references cited herein are incorporated by reference.

[0128] The terms and expressions that have been employed in the foregoing specification are used as terms of description and not of limitation, and there is no intention, in the use of such terms and expressions, of excluding equivalents of the features shown and described or portions thereof, it being recognized that the scope of the invention is defined and limited only by the claims that follow.

1. A JPEG2000 file comprising: (a) said JPEG2000 file containing a plurality of boxes containing data suitable to render an image; (b) at least one of said boxes being a metadata box; and (c) including information within said metadata box describing the content of said image.
2. The JPEG2000 file of claim 1 wherein said information is in XML format.
3. The JPEG2000 file of claim 1 wherein said JPEG2000 file is compliant with the JPEG2000 standard.
4. The JPEG2000 file of claim 1 wherein said information provides interactivity with said image.
5. The JPEG2000 file of claim 4 wherein said interactivity includes providing a bounding region of a portion of said image.
6. The JPEG2000 file of claim 5 wherein said bounding region is rectangular.
7. The JPEG2000 file of claim 5 wherein additional information regarding said content is associated with said bounding region of said image.
8. The JPEG2000 file of claim 1 wherein said information includes links to information external to said JPEG2000 file.
9. The JPEG2000 file of claim 1 wherein said information includes voice annotation.
10. The JPEG2000 file of claim 1 wherein said information includes object boundary information.
11. The JPEG2000 file of claim 1 wherein said information includes textual information regarding the content of said image free from copyright information.
12. The JPEG2000 file of claim 1 wherein said information is MPEG-7 data.
13. The JPEG2000 file of claim 12 wherein said MPEG-7 data is compliant with the MPEG-7 specification.
14. The JPEG2000 file of claim 12 wherein said information includes binary data.
15. A JPEG2000 file comprising: (a) said JPEG2000 file containing a plurality of boxes containing data suitable to render an image; (b) at least one of said boxes being a UUID box; and (c) including information within said UUID box describing the content of said image.
16. The JPEG2000 file of claim 15 wherein said information is in XML format.
17. The JPEG2000 file of claim 15 wherein said JPEG2000 file is compliant with the JPEG2000 standard.
18. The JPEG2000 file of claim 15 wherein said information provides interactivity with said image.
19. The JPEG2000 file of claim 18 wherein said interactivity includes providing a bounding region of a portion of said image.
20. The JPEG2000 file of claim 19 wherein said bounding region is rectangular.
21. The JPEG2000 file of claim 19 wherein additional information regarding said content is associated with said bounding region of said image.
22. The JPEG2000 file of claim 15 wherein said information includes links to information external to said JPEG2000 file.
23. The JPEG2000 file of claim 15 wherein said information includes voice annotation.
24. The JPEG2000 file of claim 15 wherein said information includes object boundary information.
25. The JPEG2000 file of claim 15 wherein said information includes textual information regarding the content of said image free from copyright information.
26. The JPEG2000 file of claim 15 wherein said information is MPEG-7 data.
27. The JPEG2000 file of claim 26 wherein said MPEG-7 data is compliant with the MPEG-7 specification.
28. The JPEG2000 file of claim 26 wherein said information includes binary data.
29. A JPEG2000 file comprising: (a) said JPEG2000 file being compliant with the JPEG2000 specification and containing a plurality of boxes containing data suitable to render an image; and (b) at least one of said boxes containing MPEG-7 compliant description schemes.
30. The JPEG2000 file of claim 29 further including information within at least one of a metadata box and a UUID box describing the content of said image wherein said information is in XML format.
31. The JPEG2000 file of claim 29 wherein said JPEG2000 file includes a metadata box.
32. The JPEG2000 file of claim 30 wherein said information provides interactivity with said image.
33. The JPEG2000 file of claim 32 wherein said interactivity includes providing a bounding region of a portion of said image.
34. The JPEG2000 file of claim 33 wherein said bounding region is rectangular.
35. The JPEG2000 file of claim 33 wherein additional information regarding said content is associated with said bounding region of said image.
36. The JPEG2000 file of claim 30 wherein said information includes links to information external to said JPEG2000 file.
37. The JPEG2000 file of claim 30 wherein said information includes voice annotation.
38. The JPEG2000 file of claim 30 wherein said information includes object boundary information.
39. The JPEG2000 file of claim 30 wherein said information includes textual information regarding the content of said image free from copyright information.
40. The JPEG2000 file of claim 29 wherein said MPEG-7 compliant description scheme includes binary data.
41. A MPEG-7 description scheme comprising: (a) a MPEG-7 description scheme that includes the identification of the format of at least one of audio and visual media; (b) said description scheme including data for rendering said at least one of said audio and visual media; and (c) said at least one of said audio and visual media being contained within said description scheme.
42. The MPEG-7 description scheme of claim 41 wherein said description scheme is InlineMedia.
43. The MPEG-7 description scheme of claim 41 wherein said description scheme includes a choice of two different encoding schemes for data, namely, base16 and base64.
44. The MPEG-7 description scheme of claim 43 wherein said base16 is part of an element name MediaData16.
45. The MPEG-7 description scheme of claim 43 wherein said base64 is part of an element name MediaData64.
46. The MPEG-7 description scheme of claim 41 wherein said data is binary.
47. A JPEG2000 file comprising: (a) said JPEG2000 file containing a plurality of boxes containing data suitable to render an image; (b) at least one of said boxes being at least one of a metadata box and a UUID box; and (c) including information within said at least one of said metadata box and said UUID box indicating the location of binary data, within said JPEG2000 file and not within said at least one of said metadata box and said UUID box, associated with said image.
48. The JPEG2000 file of claim 47 wherein said information is in XML format.
49. The JPEG2000 file of claim 47 wherein said JPEG2000 file is compliant with the JPEG2000 standard.
50. The JPEG2000 file of claim 47 wherein said information provides interactivity with said image.
51. The JPEG2000 file of claim 50 wherein said interactivity includes providing a bounding region of a portion of said image.
52. The JPEG2000 file of claim 51 wherein said bounding region is rectangular.
53. The JPEG2000 file of claim 51 wherein additional information regarding said content is associated with said bounding region of said image.
54. The JPEG2000 file of claim 47 wherein said information includes links to information external to said JPEG2000 file.
55. The JPEG2000 file of claim 47 wherein said binary data includes voice annotation.
56. The JPEG2000 file of claim 47 wherein said binary data includes object boundary information.
57. The JPEG2000 file of claim 47 wherein said information includes textual information regarding the content of said image free from copyright information.
58. The JPEG2000 file of claim 47 wherein said information is MPEG-7 data.
59. The JPEG2000 file of claim 58 wherein said MPEG-7 data is compliant with the MPEG-7 specification.